-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC007] Migrate the parser to the new AST #2083
Conversation
9655af8
to
751da65
Compare
8cedc56
to
2de6236
Compare
2de6236
to
f1da1dd
Compare
74f9906
to
eb338a1
Compare
Bencher Report
Click to view all benchmark results
|
0117944
to
a394c35
Compare
First stab at making the parser compatible with the new AST representation (`bytecode::ast::Ast`). This is a heavy refactoring which required to update most of `parser::uniterm` and `parser::utils` as well as `grammar.lalrpop`. The current version is far from compiling; fixing compiler errors is planned in follow-up work.
As we move toward a bytecode compiler and a bytecode virtual machine, we are replacing the left part of the pipeline with the new AST representation. The bytecode module was previously gated by an experimental feature, thea idea being that this feature would enable the whole bytcode compiler pipeline. However, for now, we only have a new AST representation, and it's being used in the mainline Nickel parser (and soon, in the typechecker, etc.). Thus we need access to the new AST representation by default, and it doesn't make much sense to gate it behind a feature. We'll reintroduce the feature once we have a prototype compiler and a bytecode virtual machine, when it will then make sense to use the feature to toggle between the legacy tree-walking interpreter and the new bytecode compiler.
…esolution for RepeatSep1)
a394c35
to
1274e47
Compare
Nickel AST conversion impact reportHere is a preliminary performance impact report of this change on small and larger examples. MethodologyI did an end-to-end run of Metrics gather runtime of part of the pipeline in milliseconds. Note that not all operations are measured, but the measures still cover most of the actual runtime. Here is what's been measured:
So, there are some inclusions here: FindingsIt's both surprising and interesting that my initial expectations were somehow not entirely right. AST conversion is taking a large part of the overall parsing, around 55-65% consistently across all example size. Since very small examples are dwarfed by preparing the stdlib, and thus dominated by parsing it, this looks bad at first: we take a big hit on parsing on small examples. However, it's not the case: when comparing with the version that doesn't perform AST conversion, this one actually sometimes perform better overall! In general the difference on small examples is very small and in the noise threshold (the order of a percent), and can be in both directions, both for the total runtime and when comparing parsing times. My interpretation is that on master, we're actually spending more than half of the parsing time allocating For bigger examples, the Mantis case is around +6% of parsing time, and +4.3% overall overhead. It's unique it that it has a very low evaluation time (it's almost a static config), and is thus still dominated somehow by parsing. The other large examples are heavily dominated by evaluation (more than 95%), and only take a few percent for parsing, so this PR is even less relevant for the overall performance. I suspect the difference we can see (like +3.3% on OPL v2) is rather due to variability, as it seems pure evaluation is taking a small hit but there's no reason for it. As some examples are a bit long to run, I haven't done averaging or warming so we can expect a bit of volatility. Hopefully it's still enough to validate that this change doesn't seem to have much performance impact (and that the new AST might get us parsing time down by around 50% once we don't have to perform the conversion anymore!). You'll find the detailed data below. Small size programsArrays exampleChange after introducing the new AST: -4,3% Before this PRruntime:eval: 12 (8.5%) total: 140 After this PRruntime:ast_conversion: 66 (49,2%) total: 134 Fibonacci exampleChange after introducing the new AST: +0% Before this PRruntime:eval: 44 (26.1%) total: 168 After this PRruntime:ast_conversion: 67 (39,9%) total: 168 GCC config exampleChange after introducing the new AST: +1% Before this PRruntime:eval: 13 (9.8%) total: 133 After this PRruntime:ast_conversion: 62 (45.5%) total: 136 Mid to large size programsChange after introducing the new AST: -0.2% Organist project.nclBefore this PRruntime:eval: 4161 (93.7%) total measured: 4440 After this PRruntime:ast_conversion: 142 (3.2%) total measured: 4434 Mantis benchmark8 Nickel files, 2356 kLoc Change after introducing the new AST: +4.6% Before this PRruntime:eval: 111 (13.4%) total measured: 827 After this PRruntime:ast_conversion: 473 (54%) total measured: 865 OPL config test v167 Nickel files, 25 kLoC Change after introducing the new AST: +1% Before this PRruntime:eval: 80008 (97.9%) total measured: 81678 After this PRruntime:ast_conversion: 454 (0.5%) total measured: 82242 OPL config test v2165 Nickel files, 80 kLoC Change after introducing the new AST: +3.3% Before this PRruntime:eval: 310946 (97.1%) total measured: 320220 After this PRruntime:ast_conversion: 3174 (0.9%) total measured: 321274 |
052eb46
to
e86bc0a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking forward to the perf improvements from deleting the old ast!
Co-authored-by: jneem <[email protected]>
Migrate the parser to the new AST
Following the step by step implementation of RFC07, this PR migrates the parser to output the new AST, The plan is to convert this to the old AST for the remaining of the pipeline (typechecking, transformations, and evaluation), and to measure if that conversion is noticeable on various type of examples (small, big, small but importing big contracts, libraries, etc.)
Reviewing
The diff is quite big, but a lot of it is mostly mechanical changes. In particular I'm not sure that reviewing the change of the
grammar.lalrpop
file is really that interesting. Changes tobytecode::ast
,bytecode::ast::compat
and other modules are probably worth looking into it, though. Forparser::utils
andparser::uniterm
, I'm not sure: most of it is mechanical, but wasn't entirely trivial either, in particular the gymnastic around type variable fixing. The latter might benefit from a pair of eyes.Perf impact
I've written a detailed report below about the performance impact of this PR. The TL;DR is that although ast conversion is taking up a surprisingly high chunk of the overall parsing time, the net result is in the few percent difference and thus in the noise threshold, given that I also had net improvements on some runs. I think this is thus reasonable to move forward, and the more of the pipeline we'll migrate to the new AST, the better the performance, given that the new AST is smaller and has better locality.